Human Action Recognition using Improved Vector of Locally Aggregated Descriptors

Authors

  • Shi-Ping Yang
  • Jin-Jang Leou
Abstract

Two high-dimensional encoding techniques, Fisher vector (FV) and vector of locally aggregated descriptors (VLAD), have recently been widely employed for human action recognition. In this study, a new human action recognition approach using improved VLAD with localized soft assignment (LSA) and second-order statistics is proposed. When encoding videos into VLAD, instead of assigning each descriptor only to its nearest visual word, we utilize localized soft assignment, i.e., we consider multiple nearest visual words. In general, LSA-VLAD captures only the first-order statistics of descriptors and visual words. In this study, LSA and second-order statistics are also encoded into a VLAD-like form, namely LSA2-VLAD. In the experiments of this study, the proposed approach combining LSA-VLAD and LSA2-VLAD achieves higher average accuracy than the 10 comparison approaches.
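
The encoding idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the parameters `k` and `beta`, the exponential weighting, the final L2 normalization, and in particular the second-order term (here a weighted elementwise squared residual) are all assumptions chosen for illustration, since the abstract does not give the exact formulas.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lsa_weights(descriptor, codebook, k, beta):
    """Soft-assignment weights over the k nearest visual words.

    Returns {word_index: weight}; the weights sum to 1.
    The exp(-beta * distance) kernel is an assumed weighting scheme.
    """
    nearest = sorted((sq_dist(descriptor, c), i) for i, c in enumerate(codebook))[:k]
    d_min = nearest[0][0]
    # Subtracting the smallest distance keeps exp() numerically stable
    # and does not change the normalized weights.
    raw = [(i, math.exp(-beta * (d - d_min))) for d, i in nearest]
    total = sum(w for _, w in raw)
    return {i: w / total for i, w in raw}


def lsa_vlad(descriptors, codebook, k=2, beta=1.0, second_order=False):
    """Encode local descriptors into a VLAD-like vector of length K * D.

    second_order=False: accumulate weighted residuals (x - c).
    second_order=True:  accumulate weighted squared residuals (x - c)^2,
                        an illustrative stand-in for second-order statistics.
    """
    dim = len(codebook[0])
    acc = [[0.0] * dim for _ in codebook]
    for x in descriptors:
        for i, w in lsa_weights(x, codebook, k, beta).items():
            c = codebook[i]
            for j in range(dim):
                r = x[j] - c[j]
                acc[i][j] += w * (r * r if second_order else r)
    flat = [v for row in acc for v in row]
    # L2 normalization is common for VLAD; assumed here, not taken from the abstract.
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat
```

Setting `k=1` reduces the sketch to ordinary hard-assignment VLAD. One possible way to combine the two encodings, again an assumption, is to concatenate `lsa_vlad(X, C)` and `lsa_vlad(X, C, second_order=True)` before passing the result to a classifier.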

Related resources

Hybrid Super Vector with Improved Dense Trajectories for Action Recognition

With recent improved dense trajectory features (HOG, warped HOF, and warped MBH), we employ two advanced super vector methods, namely Fisher Vector (FV) and soft Vector of Locally Aggregated Descriptors (VLAD-K) to encode them separately. The two individual super vectors are concatenated into a Hybrid Super Vector, and a linear SVM classifier is used to predict labels. We achieve 87.46% in ave...

Action recognition via spatio-temporal local features: A comprehensive study

Local methods based on spatio-temporal interest points (STIPs) have shown their effectiveness for human action recognition. The bag-of-words (BoW) model has been widely used and has dominated this field. Recently, a large number of techniques based on local features including improved variants of the BoW model, sparse coding (SC), Fisher kernels (FK), vector of locally aggregated descriptors (VL...

Boosting VLAD with Supervised Dictionary Learning and High-Order Statistics

Recent studies show that aggregating local descriptors into super vector yields effective representation for retrieval and classification tasks. A popular method along this line is vector of locally aggregated descriptors (VLAD), which aggregates the residuals between descriptors and visual words. However, original VLAD ignores high-order statistics of local descriptors and its dictionary may n...

Study of Human Action Recognition Based on Improved Spatio-temporal Features

Most existing action recognition methods mainly utilize spatio-temporal descriptors of single interest points, ignoring their potential integral information, such as spatial distribution information. By combining local spatio-temporal features and global positional distribution information (PDI) of interest points, a novel motion descriptor is proposed in this paper. The proposed method detec...

Spatio-Temporal VLAD Encoding for Human Action Recognition in Videos

Encoding is one of the key factors for building an effective video representation. In the recent works, super vector-based encoding approaches are highlighted as one of the most powerful representation generators. Vector of Locally Aggregated Descriptors (VLAD) is one of the most widely used super vector methods. However, one of the limitations of VLAD encoding is the lack of spatial informatio...

Publication date: 2016